Training Context-Dependent DNN Acoustic Models Using Probabilistic Sampling
Authors
Abstract
In current HMM/DNN speech recognition systems, the purpose of the DNN component is to estimate the posterior probabilities of tied triphone states. In most cases the distribution of these states is uneven, meaning that we have a markedly different number of training samples for the various states. This imbalance of the training data is a source of suboptimality for most machine learning algorithms, and DNNs are no exception. A straightforward solution is to re-sample the data, either by upsampling the rarer classes or by downsampling the more common classes. Here, we experiment with the so-called probabilistic sampling method, which applies downsampling and upsampling at the same time. To this end, it defines a new class distribution for the training data as a linear combination of the original and the uniform class distributions. As an extension to previous studies, we propose a new method to re-estimate the class priors, which is required to remedy the mismatch between the training and test data distributions introduced by re-sampling. Using probabilistic sampling and the proposed modification, we report 5% and 6% relative error rate reductions on the TED-LIUM and AMI corpora, respectively.
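The interpolated target distribution described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation; the function name, the interpolation parameter `lam`, and the toy class counts are assumptions.

```python
import numpy as np

def probabilistic_sampling_distribution(class_counts, lam):
    """Interpolate between the empirical class distribution and the
    uniform distribution: lam = 0 keeps the original (imbalanced)
    distribution, lam = 1 yields fully uniform sampling.
    Illustrative sketch of the idea, not the paper's code."""
    counts = np.asarray(class_counts, dtype=float)
    empirical = counts / counts.sum()          # original class priors
    uniform = np.full_like(empirical, 1.0 / len(empirical))
    return lam * uniform + (1.0 - lam) * empirical

# Example: three tied states with heavily imbalanced sample counts
probs = probabilistic_sampling_distribution([900, 90, 10], lam=0.5)
```

Sampling training frames class-by-class according to `probs` simultaneously downsamples the frequent states and upsamples the rare ones; since the training distribution then no longer matches the test distribution, the class priors used to convert posteriors to likelihoods must be re-estimated accordingly.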
Similar Resources
GMM-free DNN Training
While deep neural networks (DNNs) have become the dominant acoustic model (AM) for speech recognition systems, they are still dependent on Gaussian mixture models (GMMs) for alignments both for supervised training and for context dependent (CD) tree building. Here we explore bootstrapping DNN AM training without GMM AMs and show that CD trees can be built with DNN alignments which are better ma...
Gaussian free cluster tree construction using deep neural network
This paper presents a Gaussian-free approach to constructing the cluster tree (CT) that context-dependent acoustic models (CDAM) depend on. Over the last few years deep neural networks (DNN) have supplanted Gaussian mixture models (GMM) as the default method for acoustic modeling (AM). DNN AMs have also been successfully used to flat-start context-independent (CI) AMs and generate alignments on...
GMM-derived features for effective unsupervised adaptation of deep neural network acoustic models
In this paper we investigate GMM-derived features recently introduced for adaptation of context-dependent deep neural network HMM (CD-DNN-HMM) acoustic models. We improve the previously proposed adaptation algorithm by applying the concept of speaker adaptive training (SAT) to DNNs built on GMM-derived features and by using fMLLR-adapted features for training an auxiliary GMM model. Traditional...
Asynchronous, online, GMM-free training of a context dependent acoustic model for speech recognition
We propose an algorithm that allows online training of a context-dependent DNN model. It designs a state inventory based on DNN features and jointly optimizes the DNN parameters and the alignment of the training data. The process allows flat-starting a model from scratch and avoids any dependency on a GMM acoustic model to bootstrap the training process. A 15k state model trained with the proposed ...
Exploiting Eigenposteriors for Semi-Supervised Training of DNN Acoustic Models with Sequence Discrimination
Deep neural network (DNN) acoustic models yield posterior probabilities of senone classes. Recent studies support the existence of low-dimensional subspaces underlying senone posteriors. Principal component analysis (PCA) is applied to identify eigenposteriors and perform a low-dimensional projection of the training data posteriors. The resulting enhanced posteriors are applied as soft targets for...
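The PCA projection sketched in this snippet can be illustrated as follows. This is a minimal NumPy sketch of the general idea (project frame-level posteriors onto the top-k principal directions and reconstruct them); the function name, shapes, and choice of k are assumptions, not the cited paper's implementation.

```python
import numpy as np

def enhance_posteriors(posteriors, k):
    """Project frame-level posterior vectors onto the top-k principal
    components ("eigenposteriors") and reconstruct them, yielding
    low-dimensional, smoothed soft targets. Illustrative sketch only."""
    X = np.asarray(posteriors, dtype=float)    # shape: (frames, senones)
    mean = X.mean(axis=0)
    Xc = X - mean                              # centre the data
    # PCA via SVD of the centred data matrix
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    basis = vt[:k]                             # top-k eigenposteriors
    reconstructed = Xc @ basis.T @ basis       # project and reconstruct
    return reconstructed + mean
```

With k equal to the full rank of the data the reconstruction is exact; choosing a smaller k discards low-variance directions, which is what makes the reconstructed posteriors useful as denoised soft targets.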
Publication date: 2017